HIGHER-ORDER MOMENTUM DISTRIBUTIONS AND LOCALLY 
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Abstract. To achieve sparse paramctrizations that allows intuitive analysis, we aim to represent 
deformation with a basis containing interpretable elements, and we wish to use elements that have 
the description capacity to represent the deformation compactly. To accomplish this, we introduce in 
this paper higher-order momentum distributions in the LDDMM registration framework. While the 
zeroth order moments previously used in LDDMM only describe local displacement, the first-order 
momenta that are proposed here represent a basis that allows local description of afiine transfor- 
mations and subsequent compact description of non-translational movement in a globally non-rigid 
deformation. The resulting representation contains directly interpretable information from both 
mathematical and modeling perspectives. We develop the mathematical construction of the registra- 
tion framework with higher-order momenta, we show the implications for sparse image registration 
and deformation description, and we provide examples of how the parametrization enables registra- 
tion with a very low number of parameters. The capacity and interpretability of the parametrization 
using higher-order momenta lead to natural modeling of articulated movement, and the method 
promises to be useful for quantifying ventricle expansion and progressing atrophy during Alzheimer's 
disease. 
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1. Introduction. In many image registration applications, we wisli to describe 
the deformation using as few parameters as possible and with a representation that 
allows intuitive analysis: we search for paramctrizations with basis elements that have 
the capacity to describe deformation sparsely while being directly interpretable. For 
instance, we wish to use such a representation to compactly describe the progressive 
atrophy that occurs in the human brain suffering from Alzheimer's disease and that 
can be detected by the expansion of the ventricles [HI [13] . 

Image registration algorithms often represent translational movement in a dense 
sampling of the image domain. Such approaches fail to satisfy the above goals: low 
dimensional deformations such as expansion of the ventricles will not be represented 
sparsely; the registration algorithm must optimize a large number of parameters; and 
the expansion cannot easily be interpreted from the registration result. 

In this paper, we use higher- order momentum distributions in the LDDMM regis- 
tration framework to obtain a deformation parametrization that increases the capacity 
of sparse approaches with a basis consisting of interpretable elements. Wc show how 
the higher-order representation model locally affine transformations, and we use the 
compact deformation description to register points and images using very few param- 
eters. We illustrate how the deformation coded by the higher-order momenta can be 
directly interpreted and that it represents information directly useful in applications: 
with low numbers of control points, we can detect the expanding ventricles of the 
patient shown in Figure 11.11 

1.1. Background. Most of the methods for non-rigid registration in medical 
imaging model the displacement of each spatial position by either a combination of 
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(a) Baseline with control points. 



(b) Follow up (box marking zoom area, figure 

(c) and (d)). 
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(c) log-Jacobian in ventricle area. 



(d) Initial deformation field in ventricle area. 



Fig. 1.1. Progressing Alzheimer's disease cause atrophy and expansion of the ventricles. By 
placing five deformation atoms in the ventricle area of the baseline MRI scan \20\l and by using 
higher-order momenta, we can detect the expansion, (a) The position of the deformation atoms 
shown in the baseline scan; (b) the follow up scan; (c) the log-Jacobian determinant of the generated 
deformation in the ventricle area (red box in (b)); (d) the vector field at t = of the generated 
deformation. The logarithm of the Jacobian determinant and the divergence at the deformation 
atoms are positive which is in line with the expected ventricle expansion, confer also Figure [V. 5[ 



suitable basis functions for the displacement itself or for the velocity of the voxels. 
The number of control points vary between one for each voxel [2l ll7[ [7] and fewer with 
larger basis functions [ini[Sl[TT]. For all methods, the infinite-dimensional space of 
deformations is approximated by the finite- but high-dimensional subspace spanned 
by the parametrization of the individual method. The approximation will be good 
if the underlying deformation is close to this subspace, and the representation will 
be compact, if few basis functions describe the deformation well. The choice of ba- 
sis functions play a crucial role, and we will in the rest of the paper denote them 
deformation atoms. Two main observations constitute the motivation for the work 
presented in this paper: 

Observation 1: Order of the Deformation Model. In the majority of registration 
methods, the deformation atoms model the local translation of each point. We wish 
a richer representation which is in particular able to model locally linear components 
in addition to local translations. The Polyaffine and Log-Euclidean Polyaffine [31 
[T] frameworks pursue this by representing the velocity of a path of deformations 
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locally by matrix logarithms. Ideas from the Polyafhne methods have recently been 
incorporated in e.g. the Demons algorithm [31] but, to the best of our knowledge, not 
in the LDDMM registration framework. We wish to extend the set of deformation 
atoms used in LDDMM to allow representation of first- and higher-order structure and 
hence incorporate the benefits of the Polyaffine methods in the LDDMM framework. 

Observation 2: Order of the Similarity Measure. When registering DT images, 
the reorientation is a function of the derivative of the warp; curve normals also contain 
directional information which is dependent on the warp derivative and airway trees 
contain directional information in the tree structure which can be used for measuring 
similarity. These are examples of similarity measures containing higher-order infor- 
mation. For the case of image registration, the warp derivative may also enter the 
equation either directly in the similarity measure |24[|22[ or to allow use of more image 
information than provided by a sampling of the warp. Consider an image similarity 
measure on the form U{(p) ~ J^F{Imi^~^ix)),If{x))dx. A finite sampling of the 
domain Q can approximate this with 

1 ^ 

Letting {pi, . . . ,pp} be uniformly distributed points around 0, we can increase the 
amount of image information used in U'^{ip) without additional sampling of the warp 
by using a first-order approximation of ^p~^: 

^ N N 
k=l 1 = 1 

This can be considered an increase from zeroth to first-order in the approximation of 
U . Besides including more image information than provided by the initial sampling 
of the warp, the increase in order allows capture of non-translational information - 
e.g. rotation and dilation - in the similarity measure. The approach can be seen as a 
specific case of similarity smoothing; more examples of smoothing in intensity based 
image registration can be found in [5]. 

We focus on deformation modeling with the Large Deformation Diffeomorphic 
Metric Mapping (LDDMM) registration framework which has the benefit of both 
providing good registrations and drawing strong theoretical links with Lie group the- 
ory and evolution equations in physical modeling [HI [Sj . Most often, high-dimensional 
voxel-wise representations are used for LDDMM although recent interest in compact 
representations [11] [28] show that the number of parameters can be much reduced. 
These methods use interpolation of the velocity field by deformation atoms to repre- 
sent translational movement but deformation by other parts of the affine group cannot 
be compactly represented. 

The deformation atoms are called kernels in LDDMM. The kernels are centered at 
different spatial positions and parameters determine the contribution of each kernel. 
In this paper, we use the partial derivative reproducing property [35] to show that 
partial derivatives of kernels fit naturally in the LDDMM framework and constitute 
deformation atoms along with the original kernels. In particular, these deformation 
atoms have a singular higher- order momentum and the momentum stays singular 
when transported by the EPDiff evolution equations. We show how the higher-order 
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momenta allow modeling locally affine deformations, and they hence extend the capac- 
ity of sparsely discretized LDDMM methods. In addition, they comprise the natural 
vehicle for incorporating first-order similarity measures in the framework. 

1.2. Related Work. A number of methods for non-rigid registration have been 
developed during the last decades including non-linear elastic methods |21] , parametriza- 
tions using static velocity fields HHH], the demons algorithm [521132, and spline-based 
methods [551 [S]- For the particular case of LDDMM, the groundbreaking work ap- 
peared with the deformable template model by Grenander |16| and the flow approach 
by Christensen et al. [7] together with the theoretical contributions of Dupuis et 
al. and Trouve [TUl 130]. Algorithms for computing optimal diffeomorphisms have 
been developed in [3], and |31j uses the momentum representation for statistics and 
develops a momentum based algorithm for the landmark matching problem. 

Locally afhne deformations can be modeled using the Polyaffine and Log-Euclidean 
Polyaffine [3l [1] frameworks. The velocity of a path of deformations is here computed 
using matrix logarithms, and the resulting diffeomorphism flowed forward by integrat- 
ing the velocity. Ideas from the Polyaffine methods have recently been incorporated in 
e.g. the Demons algorithm [351 [55|. In LDDMM, the deformation atoms, the kernels, 
represent translational movement and the non-translational part of afhne transforma- 
tions cannot directly be represented. We will show how partial derivatives of kernels 
constitute deformation atoms which allow representing the linear parts of affine trans- 
formations. From a mathematical point of view, this is possible due to the partial 
derivative reproducing property (Zhou |35]). The partial derivative reproducing prop- 
erty, partial derivatives of kernels, and first-order momenta have previously been used 
in [6] to derive variations of flow equations for LDDMM DTI registration, in [14| to 
match landmarks with vector features, and in |15| to match surfaces with currents. 
Confer the monograph [34] for information on RKHSs and their role in LDDMM. 

In order to reduce the dimensionality of the parametrization used in LDDMM, 
Durrleman et al. introduced a control point formulation of the registration prob- 
lem by choosing a finite set of control points and constraining the momentum to be 
concentrated as Dirac measures at the point trajectories. As we will see, higher-order 
momenta make a finite control point formulation possible which is different in impor- 
tant aspects. Younes [33] in addition considers evolution in constrained subspaces. 

Higher-order momenta increase the capacity of the deformation parametrization, 
a goal which is also treated in sparse multi-scale methods such as the kernel bundle 
framework [28] . This method concerns the size of the kernel in contrast to the order 
which we deal with here. As we will discuss in the experiments section, the size 
of the kernel is important when using the higher-order representation as well, and 
representations using higher-order momenta will likely complement the kernel bundle 
method if applied together. 

1.3. Content and Outline. We start the paper with an overview of LDDMM 
registration and the mathematical constructs behind the method. In the following 
section, we describe registration using higher-order image information and parame- 
terization using higher-order momentum distributions. We then turn to the mathe- 
matical background of the method and describe the evolution of the momentum and 
velocity fields governed by the EPDiff evolution equations in the first-order case. The 
next sections describes the relation to polyaffine approaches, the effect of varying 
the initial conditions, and the backwards gradient transport. We then provide ex- 
amples and illustrate how the deformation represented by first-order atoms can be 
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interpreted when registering human brains with progressing atrophy. The paper ends 
with concluding remarks and outlook. 

2. LDDMM Registration, Kernels, and Evolution Equations. In the LD- 

DMM framework, registration is performed through the action of diffeomorphisms on 
geometric objects. This approach is very general and allows the framework to be ap- 
plied to both landmarks, curves, surfaces, images, and tensors. In the case of images, 
the action of a diffeomorphism Lp on the image / : £7 — > M takes the form ip.I = Ioip~^ , 
and given a fixed image If and moving image /„ , the registration amounts to a search 
for Lp such that Lp.Im ^If- In exact matching, we wish (p.Irn be exactly equal to // 
but, more frequently, wc allow some amount of inexactness to account for noise in the 
images and allow for smoother diffeomorphisms. This is done by defining a similarity 
measure U{ip) = U{ip.Im, If) on images and a regularization measure Ei to give a 
combined energy 

E{^)=Ei{^) + XU{^.I^,If) . (2.1) 

Here A is a positive real representing the trade-off between regularity and goodness 
of fit. The similarity measure U is in the simplest form the L^-error \ip.Imix) — 
If{x)\'^dx but more advanced measures can be used (e.g. [531 HUH]). 

In order to define the regularization term Ei , we introduce some notations in the 
following: Let the domain be a subset of R'^ with d = 2,3, and let V denote a 
Hilbcrt space of vector fields u : f2 — !> R'' such that V with associated norm || • \\v is 
included in L'^{il,R'^) and admissible [34l Chap. 9], i.e. sufficiently smooth. Given a 
time-dependent vector field 1 1— >■ vt with 

/ ||wtl|^dt<oo (2.2) 
Jo 

the associated differential equation dt^pt = Vt o ft has with initial condition ip^ a 
diffeomorphism ip'^^ as unique solution at time t. The set Gy of diffeomorphisms 
built from V by such differential equations is a Lie group, and V is its tangent space 
at the identity. Using the group structure, V is isomorphic to the tangent space at 
each point ip £ Gy- The inner product on V associated to a norm || • \\v makes 
Gy a Riemannian manifold with right-invariant metric. Setting ip'^Q = Ida, the map 
1 (pgt is a path from Ida to ip with energy given by p.2p and generated by vt- We 
will use this notation extensively in the following. A critical path for the energy p.2p 
is a geodesic on Gy , and the regularization term Ei is defined using the energy by 

Ei{ip) = min / \\vs\\y ds , (2.3) 

"teV,v}n='P Jo 

i.e. it measures the minimal energy of diffeomorphism paths from Idn to ip. Since 
the energy is high for paths with great variation, the term penalizes highly varying 
paths, and a low value of Ei(ip) thus implies that ip is regular. 

2.1. Kernel and Momentum. As a consequence of the assumed admissibility 
of V, the evaluation functionals 6x ■ v v{x) 6 M'* is well-defined and continuous 
for any x G ft. Thus, for any z £ R'' the map z ® 6x : w H> z'^v{x) belongs to the 
topological dual V*, i.e. the continuous linear maps on V. This in turn implies the 
existence of spatially dependent matrices K : ft x ft K''^'', the kernel, such that, 
for any constant vector z £ R'', the vector field K{-,x)z G V represents z ^ Sx and 
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{K{-,x)z,v)y = z ® 5x{v) for any w e point a; G and vector z G . This 
latter property is denoted the reproducing property and gives V the structure of a 
reproducing kernel Hilbert space (RKHS). Tightly connected to the norm and kernels 
is the notion of momentum given by the linear momentum operator L : V ^ V* C 
L^in^W^) which satisfies 



(iw,w)i2(n^Rd) = {Lv{x)) w{x)dx = {v,w)y 

for all v,w £ V. The momentum operator connects the inner product on V with the 
inner product in L^(f2,]R'^), and the image Lv of an element v G V is denoted the 
momentum of v. The momentum Lv might be singular and in fact L{K{-^y)z^{x) is 
the Dirac measure 5y{x)z. Considering K as the map z i— >■ J^^K{-,x)z(x)dx, L can 
be viewed as the inverse of K. We will use the symbol p for the momentum when 
considered as a functional in V* while we switch to the symbol z when the momentum 
is realized as a vector field on 51 or for the parameters when the momentum consists 
of a finite number of singular point measures. 

Instead of deriving the kernel from V, the opposite approach can be used: build V 
from a kernel, and hence impose the regularization in the framework from the kernel. 
With this approach, the kernel is often chosen to ensure rotational and translational 

II _ ||2 

invariance [34] and the scalar Gaussian kernel K(x,y) = exp(— )Idrf is an often 

used choice. Confer [T2 for details on the construction of V from Gaussian kernels. 

2.2. Optimal Paths: The EPDiff Evolution Equations. The relation be- 
tween norm and momentum leads to convenient equations for minimizers of the energy 
(|2.ip . In particular, the EPDiff equations for the evolution of the momentum zt for 
optimal paths assert that if (ft is a path minimizing Ei((p) with ipi = ip minimizing 
E{ip) and Vt is the derivative of ipt then vt satisfies the system 



vt ^ K{-, x)zt{x)dx , 

■^Zt = -Dztvt - ZfV ■ Vt - {Dvt)'^zt 

with Dzt and Dvt denoting spatial differentiation of the momentum and velocity 
fields, respectively. The first equation connects the momentum Zt with the velocity 
wt, and the second equation describes the time evolution of the momentum. In the 
most general form, the EPDiff equations describe the evolution of the momentum 
using the adjoint map. Following [34) . define the adjoint Kdipv{x) = (Dipv) o ip^^[x) 
for V G V. The dual of the adjoint is the functional Ad* on the dual V* of V defined 
by {Ad*^p\v) = {p\Ad^{v))E Define in addition Ad^v ^ A'(Ad* (iu)) which then 
satisfies ^Ad^w, vu^ ~ (Ad* (L?;)|u)), and let V<^[/ denote the gradient of the similarity 

measure U with respect to the inner product on V so that (V,^[/, v)y = d^U (%l)Q^oLp) for 
any variation v gV and diffeomorphism path ^/^q^ with derivative v. For optimal paths 
Vti the EPDiff equations assert that vt = Ad^„^ v\ with v\ = — iV^" V which leads 

Y^fl ^ '^01 

to the conservation of momentum property for optimal paths. Conversely, the EPDiff 
equations reduce to simpler forms for certain objects. For landmarks a;i, . . . , xat, the 



^ Here and in the following, we will use the notation (j>\v) := p{v) for evaluation of the functional 
p G V* on the vector field v S: V. 
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momentum will be coneentrated at point trajectories ^ := (pt{xk) as Dirac measm'es 
Zt,k ® ^xt k leading to the finite dimensional system of ODE's 

N ^ 

vt = ^Ki-,xt,i))zt,i , -^^Pt{xk) = vt{xt.k) , 

-T^^t,k = - 'Y^V iK{xt,i,Xt^k)zJ^kZt,i ■ 
1=1 

3. Registration with Higher-Order Information. We here introduce higher- 
order momentum distributions for registration using higher-order information with 
the LDDMM framework. We start by motivating the construction by considering the 
approximation used when computing the similarity measure. We then describe the 
deformation encoded by higher-order momenta and the evolution equations in the 
finite case, and we use this to derive a registration algorithm using first-order infor- 
mation. The mathematical background behind the method will be presented in the 
following sections. 



We will motivate the introduction of higher-order momenta by considering a spe- 
cific case of image registration: we take on the goal of using a control point formulation 
[llj when solving the registration problem ()2.ip and hence aim for using a relatively 
sparse sampling of the velocity or momentum field. To achieve this, we will consider 
the coupling between the transported control points {ip^^{xi), . . . , (p^^(a;jv)} and the 
similarity measure in order to ensure the momentum stays singular and localized at 
the point trajectories while removing the need for warping the entire image at every 
iteration of the optimization process. 

Considering a similarity measure U{{p) = /^^ F{I„i{'^^^{x)), If{x))dx as discussed 
in the introduction, and a finite discretization JJ^^if) = 1/N X^fcLi F{^-Im{xk), If{xk)) 
with a sparse set of control points {xk\. While using U'^((p) to drive registration of 
the images will be very efficient in evaluating the warp in few points, it will suffer 
correspondingly from only using image information present in those points. Apart 
from not being robust under the presence of noise in the images, the discretization 
implies that local dilation or rotation around the points (p~^{xk) cannot be detected: 
any variation v Q V of ip keeping ip~^(xk) constant for all fc = 1, . . . , A'^ will not change 
U^{(p). Formally, if tpQ^^ is a diffeomorphism path that is equal to at t = and has 
derivative u at t = 0, i.e. 9eV'0e — v and t/'qo = </5, then 

deF{il;o^.I„i{xk),If{xk)) ^ diF{ip.I„i{xk),If{xk)) ■ (V^-i(^^)/,„)'^w((p"^(a:fe)) 

which vanishes if v{(p~^{xk)) = 0. Here diF denotes the derivative of F : — ^ R 
with respect to the first variable. 

A simple way to include more image information in the similarity measure is to 
convolve with a kernel Kg , and thus extend C/° to 

1 ^ f 

C^^(<<5) = — ^ C/f^ / ^'>iP + ^k,Xk)F{(p.Im{p + Xk),Ifip + Xk))dp 
k=l ■^^^ 



with a normalization constant. If Kg is a box kernel, this amounts to a finer 
sampling of both the image and warp, and hence a finer discretization of the Riemann 
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integral. The kernel Kg should not be confused with the RKHS kernel connected to 
the norm on V that is used when generating the X^-gradient. A Gaussian kernel may 
be used for Ks, and more information on using smoothing kernels for intensity based 
image registration can be found in [9l |36] . 

The measure U^{lp) is problematic since a variation of ip would affect not only 
the point tp~^{xk) but also ip.Imip + ^k), and U^{ip) will therefore be dependent on 
f-Im{p+Xk) for any p where Ks{p, Xk) is non-zero. In this situation, the momentum is 
no longer concentrated in Dirac measures located at ip^^{xk), and it will be necessary 
to increase the sampling of the warp. However, a first-order expansion of Lp~^ yields 
the approximation 

1 ^ f 

= — ^CK, / K,{p+Xk,Xk)F{I^{D^^^-^p+ip-^{xk))Jf{p+Xk))dp . (3.1) 
fc=i •'^^ 

The measure U^{lp) is now again local depending only on ip~^{xk) and the first-order 
derivatives Dx^ip~^. It offers the stability provided by the convolution with Kg, 
and, importantly, variations v of (p keeping (p~^{xk) constant but changing Dxf.f~^ 
do indeed affect the similarity measure. This implies that [/^((ys) is able to catch 
rotations and dilations and drive the search for optimal Lp accordingly. Please note 
the differences with the approach of Durrleman et al. [11]: when using U^{ip) as 
outlined here, the need for flowing the entire moving image forward is removed and 
the momentum field will stay singular directly thus removing the need for constraining 
the form of the velocity field. 

3.1. Evolution and Deformation with Higher-Order Information. The 

dependence on Dip in the similarity measure U^{p) raises the question of how to 
represent variations of Dp in the LDDMM framework. As we will outline here, higher- 
order momenta appear as the natural choice for such a representation that keeps the 
benefits of the finite control point formulation. Mathematical details will follow in 
the next sections. 

Recall the reproducing property of the RKHS structure, i.e. {K{-,x)z,v)y = 
z (g) bx{v) for w € y, a: € and z e . Let us define the maps z ® Z?" : F ^- R 
that extend the Diracs z ® bxiv) by measuring the derivative of v at x. These will be 
denoted higher-order Diracs, and we say that the momentum distribution is of higher- 
order if it is a sum of higher order Diracs. When applying the momentum operator L 
to the higher-order Diracs. we will get partial derivatives D"K of the RHKS kernel 
K. 

In particular, we will sec that when using similarity measures such as U^{p), the 
momentum field will be a linear combination of higher-order Diracs and the velocity 
field will; correspondingly, be a linear combination of partial derivatives of K. This 
will imply that the finite dimensional system of ODE's p.4p describing the EPDiff 
equations in the landmark case will be extended so that the velocity vt will contain 
partial derivatives D^K. In the first-order case, we will get the velocity 

N d 

v{-) = {K{;Xi)zi + Y.^'^^-^'=i)'l) (3.2) 

1=1 j=l 

where Zi denotes the coefficients of the Dirac measures as in p.4|) but now the ad- 
ditional vectors zf denote the coefficients of the first-order Diracs zf (g) ZJj. for each 
of the d dimensions j = 1, . . . ,d. We will later show how these coefficients evolve. 
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(a) The RHKS kernel encodes (b) Ensembles of kernels can (c) Derivatives of the kernel 
local translation. approximate locally aflino de- directly encode locally affine 

formation. deformation. 

Fig. 3.1. The deformation encoded by the kernel: (a) the RHKS kernel, here a Gaussian of 
scale 8 in grid units, encodes local translation; (b) locally affine deformation, here expansion, can be 
approximated by placing kernels close together. When moving these kernels infinitesimally close, the 
derivative of the kernel arises in the limit, and (c) the derivative encode locally affine deformation 
directly. With higher-order momenta, we will use derivatives of the kernel as deformation atoms. 

Combined with knowledge of how variations of Zi and zl affect the system, we can 
transport variational information along the optimal paths specified by the EPDiff 
equations and thus provide the necessary building blocks for a first-order registration 
algorithm. 

Figure 13.11 illustrates how the local translation encoded by the kernel is com- 
plemented by locally affine deformation when incorporating first-order momenta and 
corresponding partial derivatives of the kernel. Using the language of deformation 
atoms, the first-order constructions adds partial derivatives of kernels to the usual set 
of atoms, and the deformation atoms are thus able to compactly encode expansion, 
contraction, rotation etc. We can directly interpret the coefficients of the first-order 
momenta as controlling the magnitude of these first-order deformations. In Figure [7?T] 
in the experiments section, we give additional illustrations of the deformation encoded 
by the new atoms. 

3.2. Algorithm for First-Order Registration. In this section, we will derive 
a registration algorithm for similarity measures incorporating first-order information 
such as U^{ip). Since the algorithm works for general first-order measures, we will 
again let U denote the similarity measure with U^{(p) being just a particular example. 

There exists various choices of optimization algorithms for LDDMM registration. 
Roughly, they can be divided into two groups based on whether they represent the 
initial momentum/velocity or the entire path ipt- Here, we take the approach of 
incorporating first-order momenta with the shooting method of e.g. Vaillant et al. 
[3Tj . The algorithm will take a guess for the initial momentum, integrate the EPDiff 
equations forward, compute the similarity measure gradient ^U{ip), and flow the 
gradient backwards to provide an improved guess. 

The registration problem ()2.1|) consists of both the similarity measure and the 
minimal path energy Ei. For e.g. landmark based registration, the similarity U{ip) is 
most often expressed in terms of ip directly whether as the similarity measure is usually 
dependent on the inverse (p^^ for image registration. In the first case, the gradient 
V^C/ is known, and, given the initial momentum zg, we can obtain the gradient VzqC/ 
for a gradient descent based optimisation procedure from the backwards transport 
equations that we derive in Section[Bl For the energy part, it is a fundamental property 
of critical paths in the LDDMM framework that the energy stays constant along the 
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path. Thus, wc can easily compute the gradient from the expressions provided in 
Section m Given this, the zeroth order matching algorithm in the initial momentum 
is generalized to zeroth and first-order momenta in Algorithm [TJ 



Algorithm 1 Matching with Zeroth and First-Order Momenta. 
zq initial guess for initial momentum 
repeat 

Solve EPDiff equations forward 
Compute similarity U 
Solve backwards the transpose equations 
Compute the energy gradient VjlfolP 
Update Zq from V||vo|P + Vz^fJ 
until convergence 



Traditionally, the similarity measure U{ip) is in image matching formulated using 
the inverse of ip, and this approach was taken when formulating the approximation 
U^{(p) in p.l|) . For this reason, at finite control point formulation is naturally ex- 
pressed using a sampling {a;i, . . . ,xn} in the target image with the algorithm opti- 
mizing for the momentum zi at time t ~ 1. The evaluation points ip~^{xk) are then 
generated by flowing backwards from t = 1 to t = 0, and the gradient of U{ip) can 
be computed in ip~^{xk) before being flowed /orwards to update zi. Algorithm [T] will 
accommodate this situation by just reversing the integration directions. The control 
points can be chosen either at e.g. anatomically important locations, at random, or on 
a regular grid. In the experiments, we will register expanding ventricles using control 
points placed in the ventricles. 

3.3. Numerical Integration. The integration of the flow equations can be 
performed with standard Runge-Kutta integrators such as Matlabs ode45 procedure. 
With zeroth order momenta only and N points, the forward and backwards system 
consist of 2dN equations. With zeroth and first-order momenta, the forward system 
is extended to N{2d -f d^) and the backwards system to 2N{d -I- d^). For d = 3, this 
implies an 2.5 times increase in the size of the forward system and 4 times increase 
in the backwards system. As suggested in Figure 13.11 the first-order system can be 
approximated using ensembles of zeroth-order atoms. Such an approximation would 
for c? = 3 require at least four zeroth-order atoms for each first-order atom making 
the size of the approximating system equivalent to the first-order system. Due to the 
non-linearity of the systems, the effect of the approximation introduced with such as 
an approach is not presently established. 

In addition to the increase in the size of the systems, the extra floating point op- 
erations necessary for computing the more complicated evolution equations should be 
considered. The additional computational effort should, however, be viewed against 
the fact that the finite dimensional system can contain orders of magnitude fewer 
control points, and the added capacity of deformation description included in the 
derivative information. In addition and in contrast to previous approaches, we trans- 
port the similarity gradient only at the control point trajectories, again an order of 
magnitude reduction of transported information. 

4. Higher-Order Momentum Distributions. We now link partial derivatives 
of kernels to higher-order momenta using the derivative reproducing property, and we 
provide details on the EPDiff evolution equations that we outlined in the previous 
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section. Wc underline that the analytical of structure of LDDMM is not changed 
when incorporating higher-order momenta, and the evolution equations will thus be 
particular instances of the general EPDiff equations. These equations in Hamiltonian 
form constitutes the explicit expressions that allows implementation of the registration 
algorithm. 

We will restrict to scalar kernels when appropriate for simplifying the notation. 
Scalar kernels are diagonal matrices where all diagonal elements are equal. Thus, we 
can consider K{x,y) both a matrix and a scalar so that the entries _ftr| ix,y) of the 
kernel in matrix form equals the scalar K{x, y) if and only if i = j and otherwise. 

4.1. Derivative Reproducing Property. Recall the reproducing property of 
the RKHS structure, i.e. {K{-,x)z,v)y = z ® 5x{v) iov v € V , x & and z G K'^. 
Zhou |35] shows that this property holds not only for the kernel but also for its partial 
derivatives. Letting D"v denote the derivative of w G at x G with respect to the 
multi-index a, 

Q\a\ 

^> = JfTi nS7^(^) 

O^i • ■ • O^d 

and defining {D^Kz){y) = D^{K{-,y)z) for z G M'*, Zhou proves that D-^Kz G V 
and that the partial derivative reproducing property 

{D^Kz,v)y = z^D:{v) (4.1) 

holds when the maps in V are sufficiently smooth for the derivatives to exist. We 
denote the maps z (X) Z?" : y — > M defined by z -D"(w) := z^D^v higher-order 
Diracs, and we say that the momentum distribution is of higher order if it is a sum 
of higher-order Diracs. It follows that 

z^D^ = {v^ {D^Kz,v)y) eV* . 

As a consequence, we can connect higher-order momenta and partial derivatives D"K 
of the kernel. Recall that the momentum map L : V ^ V* satisfies (Lw,w)^2 = 
{v,w)y. With the higher-order momenta, 

{LD^Kz, v)^^ = {D^Kz, v)y = z® D'^{v) = {z ® D^,v)^, . 

Thus LD^Kz = z (E) D" or, shorter, LD"K = I?". That is, partial derivatives of the 
kernel and higher-order momenta corresponds just as the kernels and Diracs measures 
in the usual RKHS sense. 

Consider a map on diffeomorphisms U : Gy K e.g. an image similarity measure 
dependent on Lp. In a finite dimensional setting with evaluation points x^, U would 
decompose as U{ip) = P o Q{ip) with Q{f) = {(p{xi), . . . ,lp{xn)) and P : R'^^ — >■ 
M. Introducing higher-order momenta, we let Q(</j) = {D"^{ip), . . . ,D"-^{(p)) with J 
multi-indices aj, and decompose U as U{(p) = P o Q{ip) with P : M.'^^'^ — M. We 
allow aj to be empty and hence incorporate the standard zeroth order case. The 
partial derivative reproducing property now lets us compute the ^-gradient of U as 
a sum of partial derivatives of the kernel. 

Proposition 4.1. Let V'^-'P denote the gradient with respect to the variable 
indexed by DxHf) in the expression for Q. Then the gradient VipU E V of U with 
respect to the inner product in V is given by ~ X^feLi ^'^'^q(^)^- 
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Proof. The gradient V^J7 at ip is defined by {V^U^v) = dJJ{ev + (p) for all 
variations v & V . For such u, we get using (|4.ip that 

a,C/(ei; + ^) = o Q(ei, + ^) = aeP(Z??^(ei. + ^)) = d,P{eD^lv + D^l^) 

N J I N J \ 

fc=lj = l \A;=li=l / 



4.2. Momentum and Energy. As a result of Proposition UTTJ the momentum 
of the gradient of U is LV ~ X^^i ^Q{^p)^ ® -^"fc ■ general, if u G ^ is 

represented by a sum of higher-order momenta, the energy can be computed 

using (|4.ip as a sum of partial derivatives of the kernel evaluated at the points Xk- 
To keep the notation brief, we restrict to sums of zeroth and first-order momenta in 
the following. If v{-) = Y.k=i {^{xk, ■)zk + Y.'j=i D'K{xk, ■)zl) , we get the energy 

/ N d N d \ 

Ml = ( E (^(^''' >^ + E D'K{xu, -H) , E (^(^fc' + E ^'^(^^' ) 

\fc=l j = l k=l 3 = 1 I y 

N N d 

= ^ {K{xu-)zuK{xk,-)zk)v^ E E (D'K{xi,-)zJ,D'K{xk,-)zl)^ 

k,l=l k,l=lj,i=l 
N d 

k,i=i j=i 

N d d 

= {zfK{xuXk)zk + DlD{K{xuXk)zl +2Y,zlD{K{xuXt,)zl) 

k,l=l j,i=l j=l 

(4.2) 

with D^^K{-, •) denoting differentiation with the respect to the gth variable, g = 1,2, 
and jth coordinate, j ~ 1, . . . ,d. For scalar symmetric kernels, this expression reduces 
to 



N d 

{zf K{xi,Xk)zk+ ^ [D2S/iK{xi,Xk)y^ 

k,l=l 



+ 2Y,(yiK{xuxi)yzlzl) . 

J=l 

4.3. EPDiff Equations. It is important to note that higher-order momenta 
offer a convenient representation for the gradients of maps U incorporating derivative 
information but since the partial derivatives of kernels are members of V and the 
higher order momentum in the dual V* , the analytical of structure of LDDMM is not 
changed. In particular, the adjoint form of the EPDiff equations, i.e. that optimal 
paths vt satisfy Vt = KA^^^vi with vi = —^^fiJJ-, is still valid. The momentum 
pi = Lvi is transported to the momentum pt by Ad*i.^pi. Because 

{ptH = (piiAd^.^H) = {p,\iDip^,,w)oi^-,y') , 
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if pi is a sum of higher-order Diraes, pt wih be sum of higher-order Diraes for ah t. 
However, since the time evolution of {pt\w) with the above relation involves derivatives 
of Dip'^i, this form is inconvenient for computing pt- Instead, we make use of the 
Hamiltonian form of the EPDiff equations [34l P. 265]. Here, the momentum pt 
is pulled back to po but with a coordinate change of the evaluation vector field: 
the Hamiltonian form nt is defined by (ptlw) := [po\{D(pQ^)~^{y)w{y)) ^ where the 
subscript stresses that (I?(^Q()~^(y)u>(y) is evaluated as a y-dependent vector field. 
To simplify the notation, we write just ipt instead of tpQ^. Using this notation, the 
evolution equations become 

d 

dm{y) = ^ {pt\K'{ift{x),'ft{y)))^ej 

(4.3) 

{dtp.t\w) ^ -^{fit\{pt\D2K^{'ft{x),'ft{y))w{y))^ej)^ . 
i=i 

The system forms an ordinary differential equation describing the evolution of the 
path and momentum [34) when (polw) does not involve derivatives of w, e.g. when po 
and hence pt is a vector field zt and the first equation therefore is an integral 

dmiy) ^ / K{(pt{y),(pt{x))zt{x)dx . 



For the higher-order case, we will need to incorporate additional information in the 
system. 

Again we restrict to finite sums of zeroth and first-order point measures, and we 
therefore work with initial momenta on the form 

N N d 

k=l k=l j=l 

with usual denoting the point positions ipt{xi) at time t. Then 

{pt\w) ^ {po\Dipt{y)~^w{y))^ 

f N N d 

= / {Y.^"''^®^^o..+Y.Y.''ik®D^.,„,'^D^t{yy^w{y)dy 
•^^ fe=i fe=ii=i 

N d 

= H {{D^t{xo^ky^'^zo,k + {DW^t{xo,k)~^f z^ ,;} ® S,„ Jw) 
k=l j=l 

N d 
k=l3 = l 

showing that pt = ® Sxo,k + Y.k=i Ej=i ^^ik ® ^^^.^^^ with 

d 

pt,k = D(pt{xo^k)~^''^ ZQ^k + Y Diptixo,k)~^) ^o.fc 



(4.5) 



p^ J. = D(pi(xo,fc) ' zj 



0,k 
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The momentum pt can the be recovered as 

N N d 

{pt\w) = {pt\wOLpt) = {^p.t,k®5^o.k + ^J'ik® ^^o.k)w ° 

k=l k=l j=l 

N N d 

= f^t,k «> w + X! X! tA'lDw{D^ipt){xo.k) 

k=l k=l j=l 

N N d d 

= l^t,k ® 5^,,,w + CY^D"pt){xo,k)Vuk) ® D^^^ .w 

k=l k=l j=l i=l 

and hence the coefficients of the momentum zt^k and j, (confer (j4.4p ) are given by 

zt,k = IJ-t,k and J, = I]iLi(-DVt)(2;o,fe)"'M^- We note that both z{j. and filj. are 
coordinate vectors of the first-order parts of the momentum in ordinary and Hamil- 
tonian form respectively. For each point k and time t, these coordinate vectors thus 
represent two d x d tensors. 

4.4. Time Evolution. Even though ^t,k in (|4.5p depend on the second order 
derivative of ip^ we will show that the complete evolution in the zeroth and first-order 
case can be determined by solving for the points ipt{xkfi), the matrices Dipt{xk,o), 
and the vectors ^J.t,k■ This will provide the computational representation we will use 
when implementing the systems. 

Using (|4.3p . (pt evolves according to 

d „ N d 

i=l •'^ k=l 3 = 1 

d N d 

^YJ2i'^lkK\Mxo,k),My)) + Y'^tjDi^'^'i'Ptixo,k^ ■ 
i=i k=i j=i 

With scalar kernels, the trajectories Xt^k are given by 

N d 

dtPt{xo,k) = Y i^ifti^o.i), 'Pt{xo,k))fJ-t,i + Y '^iK(pt{xo,i),pt{xo.k))'^ 'Pt{xo.i)nl i) ■ 
1=1 j=i 

It is shown in [S!] that the evolution of the matrix Dipt{xkfi) is governed by 

d 

dtDipt{y)a = ^ {pt\D2K\pt{x),pt{y))Dipt{y)a) ^e^ . 

i=l 

Inserting the Hamiltonian form of the higher-order momentum, each component (r, c) 
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(row/column) of the matrix Dtpt(y) thus evolves according to 



. N d 

■'^ k=i j=i 

N 

fe=i 

N d d 
k=l j=l 1=1 

With scalar kernels, the evolution at the trajectories is then 

N 

dtDtpt{xo^kT = E {^■2K{lpt{xo,l),(pt{xo,k))'^D''^pt{xo^k)^^t,l 
1=1 

d 

+ E {Diy2K{ipt{xo.i), Vtixo,k))D^ iptixo^i))'^ D''ipt{xQ^k)fJ-li^ ■ 
i=i 

The complete derivation of the evolution of fit is notationally heavy and can be 
found in the supplementary material for the paper. Combining the evolution of fit 
with the expressions above, we arrive at the following result: 

Proposition 4.2. The EPDiff equations in the scalar case with zeroth and first- 
order momenta are given in Hamiltonian form by the system 

N d 

dt^PtixoM) = E {^i^i,i^^t.k)i^td + \7iK{xtj,xtM)'^D^ipt{xoj)fil i) 
1=1 j=i 

N 

dtDipt{xo,kY E {^2K{xtd,xt^k)'^ D''ipt{xo,k)fJ-t,i 
1=1 

d 



3 = 1 

N 



(4.6) 



- ^ {Di\72K{xt,i,Xt,k)D'(pt{xo,i)) D''Lpt{xQ^k)lJ-l^ 

3 = 1 

dtfJ-t,k = - E {^{t^ukfJ-t,i)^2K{xtd,xt,k) 
1=1 
d 

+ E {t^tj: f^t,l) D2V2K{xt,l, XtM)D^ ¥'t{xo,k) 

i=i 

d 

+ E {t^I.kt^ii) Di'^^K (xtj, XtM)D^ Vt{xo,i) 
i=i 

d 

+ E (MCf/^li)^2(AV2A'(xtj,Xt,fc)-D-'(pt(xoj))i^''Vt(a^o,fc; 

Note that both xi^k ~ ^oii^o.k) and DipQ-^^{x()^k) are provided by the system 
and hence can be used to evaluate a similarity measure that incorporates first-order 
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information. As in the zcroth order case, the entire evolution can be recovered by the 
initial conditions for the momentum. 

5. Locally Affine Transformations. The PolyafSne and Log-Euclidean Poly- 
afhne [21 [J frameworks model locally affinc transformations using matrix logarithms. 
The higher-order momenta and partial derivatives of kernels can be seen as the LD- 
DMM sibling of the PolyafEne methods, and diffeomorphism paths generated by 
higher-order momenta, in particular, momenta of zcroth and first-order, can locally 
approximate all affine transformations with linear component having positive deter- 
minant. The approximation will depend only on how fast the kernel approaches zero 
towards infinity. The manifold structure of Gy provides this result immediately. In- 
deed, let ^p{x) — Ax -I- 6 be an affine transformation with det(^) > 0. We define a 
path ipt of finite energy such that ipi Ki Lp which shows that (pi G Gy and can be 
reached in the framework. The matrices of positive determinant is path connected 
so we can let ijjt be a path from Idd to A and define "iptix) = iptX + ht. Then with 
vt{x) = {dt'>pt)'ipr^ (x) + b, we have dti!t{x) = idt'>pt)x + b = Vt o iptix) and 

x+ Vt o tpt{x)dt — X + / {dtipt)x + bdt ^ Lp{x) . 
Jo Jo 

Now use that {dtipt)'4>i~^ (x) = {dtil^t){'^t)~^{x — bt) and let the Mt = (wi.t . . -rnd.t) be 
the <-dependent matrix {dt4't){'^t)~^ so that the first term of Vt{x) equals Mt{x — bt). 
Then choose a radial kernel, e.g. a Gaussian K„, and define the approximation Vt of 
Vt by 

d 

vt{x) = E^^„(o)^-(-^)"'j'* + KMM,x)b . (5.1) 

The path ipl^i generated by vt then has finite energy, and 

Voiix)=x+ ft o(p^j(x)dt w (^(a;) 
Jo 

with the approximation depending only on the kernel scale a. Note that the affine 
transformations with linear components having negative determinant can in a similar 
way be reached by starting the integration at a diffeomorphism with negative Jacobian 
determinant. 

In the experiments section, we will illustrate the locally affine transformations 
encoded by zeroth and first-order momenta, and, therefore, it will be useful to intro- 
duce a notation for these momenta. We encode the translational part of either the 
momentum or velocity using the notation 

TsUb) = K,{x,-)b 

and the linear part by 

d 

Uu,{M) = Y,DiKA:)m, 

i=i 

with mi, rrij being the columns of the matrix A/. Equation (jS.ip can then be written 



vt{x) = Lin^,^(o) (Mt) + Tsl^,^(o)(6) 



(5.2) 
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Wc emphasize that though we mainly focus on zeroth and first-order momenta, the 
mathematical construction allows any order momenta permitted by the smoothness 
of the kernel at order zero. 

6. Variations of the Initial Conditions. In Algorithm [1] we used the varia- 
tion of the EPDiff equations when varying the initial conditions and in particular the 
backwards gradient transport. We discuss both issues here. 

A variation 5po of the initial momentum will induce a variation of the system 
(|4.6[) . By differentiating that system, we get the time evolution of the variation. To 
ease notation, we assume the kernel is scalar on the form K{x,y) = ^{\x — yp) and 
write jt,ik = K{xt.i,xt^k)^ Variations of the kernel and kernel derivatives such as the 
entity SViK^xt^i^Xt^k) below depend only on the variation of point trajectories Sxt^i 
and Sxt^k- The full expressions for these parts are provided in supplementary material 
for the paper. The variation of the point trajectories in the derived system then takes 
the form 

N 

dtSipt{xo,k) = ^ {SK{xt.i,xt.k)fJ.t.i + 7t,ikSfJ.t.i) 
1=1 

N d 

+ {^^i^i^t,hXt,k)'^ D\t{xo,i)fJ'li + '^iK{xt^i,Xt,k)'^6D^(pt{xo,i)fJ.li 

1=1 j=i 

+ V iK{xtj,xt^k)'^ ipt{xo,i)5^ili) 

The similar expressions for the evolution of S^t f^ and dDipt{xo,k) are provided in the 
supplementary material. The variation of /i^ j, is available as 

^^^ik = ~{D(pt{xo^ky'^dDift{xQ^k)D'ft{xo^ky'^)'^ z^f^i^ + Dipt{xo^ky'^'^Szlj^ . 

However, when computing the backwards transport, we will need to remove the de- 
pendency on (5zg J. which is only available for forward integration. Instead, by writing 
the evolution of /i^ f, in the form 

^t^^ik = dtDipt{xo^ky^'^ z^f. = -{DLpt{xo^ky^dtD(pt{xo,k)Dtpt{xo^ky^)'^ z^j^ 
= -Dipt{xo^k)~^'^dtDipt{xo^k)^ nl f, , 
we get the variation 

'^t^^^t,k = -SDiptixo^ky^-'^dtDipt{xo^k)'^^j,l f^ ~ Diptixo^ky^-'^ dtSDiftixQ^k)"^ filj, 
- D(pt{xo^k)~^''^dtDLpt{xo_k)'^Sfil ^. . 

6.1. Backwards Transport. The correspondence between initial momentum 
Pq and end diffeomorphism ipQi asserted by the EPDiff equations allows us to view the 
similarity measure U{(pQi) as a function of po- Let A denote the result of integrating 
the system for the variation of the initial conditions from t = Q to t = 1 such that 
w — ASpo G V for a variation 5po. We then get a corresponding variation 6U in the 
similarity measure. To compute the gradient of C/ as a function of po, we have 



^The subscript notation is used in accordance with |34) . Please note that -yt ik contains three 
separate indices, i.e. the time t and the point indices I and k. 
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Thus, the y*-gradicnt of '^pgU is given by A'^Vip^_^U. The gradient can equivalently 
be computed in momentum space at both endpoints of the difFeomorphism path using 
the map P defined in Proposition 14. II 

The complete system for the variation of the initial conditions is a linear ODE, 
and, therefore, there exists a time-dependent matrix Mt such that the ODE 

dtVt = Mtyt 

has the variation as a solution yt- It is shown in [31] that, in such cases, solving the 
backwards transpose system 

dtwt = -Mjwt (6.1) 

from t = 1 to t = provides the value of A^w. Therefore, we can obtain V p^lJ by 
solving the transpose system backwards. The components of Mt can be identified 
by writing the evolution equations for the variation in matrix form. This provides 
MJ and allows the backwards integration of the system [O] The components of the 
transpose matrix Mt are provided in the supplementary material for the paper. 

7. Experiments. In order to demonstrate the efficiency, compactness, and in- 
terpretability of representations using higher-order momenta, we perform four sets of 
experiments. First, we provide four examples illustrating the type of deformations 
produced by zeroth and first-order momenta and the relation to the Polyaffine frame- 
work. We then use point based matching using first-order information to show how 
complicated warps that would require many parameters with zeroth order deforma- 
tion atoms can be generated with very compact representations using higher-order 
momenta. We underline the point that higher-order momenta allow low-dimensional 
transformations to be registered using correspondingly low-dimensional representa- 
tions: we show how synthetic test images generated by a low-dimensional transfor- 
mation can be registered using only one deformation atom when representing using 
first-order momenta and using the first-order similarity measure approximation (|3.ip . 
We further emphasize this point by registering articulated movement using only one 
deformation atom per rigid part, and thus exemplify a natural representation that 
reduces the number of deformation atoms and the ambiguity in the placement of the 
atoms while also reducing the degrees of freedom in the representation. Finally, we 
illustrate how higher-order momenta in a natural way allow registration of human 
brains with progressing atrophy. We describe the deformation field throughout the 
ventricles using few deformation atoms, and we thereby suggest a method for detecting 
anatomical change using few degrees of freedom. In addition, the volume expansion 
can be directly interpreted from the parameters of the deformation atoms. We start 
by briefly describing the similarity measures used throughout the experiments. 

For the point examples below, we register moving points xi, . . . , xjv against fixed 
points j/i, . . . , j/AT. In addition, we match first-order information by specifying values 
of D^^ip. This is done compactly by providing matrices Yj, so that we seek D^^f = Y;^ 
for all fc = 1, . . . , A'^. The similarity measure is simple sum of squares, i.e. 

TV 
i=l 

using the matrix 2-norm. This amounts to fitting (p against a locally affine map with 
translational components yu and linear components Yfc. For the image cases, we use 
i^-similarity to build the first-order approximation p.ip with the smoothing kernel 
Ks being Gaussian of the same scale as the LDDMM kernel. 
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(a) Expansion (b) Contraction 




(c) Rotation (— 7r/2) (d) Two rotations (it/2) 

Fig. 7.1. The effect of the generated deformation on an initially square grid for several initial 
first-order momenta: Using the notation of section\5i (a) expansion po = Lino(Id2); (b) contraction 
po = Lino(— Id2); (c) rotation po = Lino(Rot(i')), v = —-k 12; (d) two rotations v = it 12. The kernel 
is Gaussian with a = 8 in grid units, and the grids are colored with the trace of Cauchy-Green strain 
tensor (log-scale). Notice the locality of the deformation caused by the finite scale of the kernel, and 
that the deformation stays diffeomorphic even when two rotations force conflicting movements. 

7.1. First Order Illustrations. To visually illustrate the deformation gener- 
ated by higher-order momenta, we show in Figure 17.11 the generated deformations 
on an initially square grid with four different first-order initial momenta. The de- 
formation locally model the linear part of affine transformations and the the locality 
is determined by the Gaussian kernel that in the examples has scale cr = 8 in grid 
units. Notice for the rotations that the deformation stays diffeomorphic in the pres- 
ence of conflicting forces. The similarity between the examples and the deformations 
generated in the Polyaffine framework |lj underlines the viewpoint that the registra- 
tion using higher-order momenta constitutes the LDDMM sibling of the Polyaffine 
framework. 

7.2. First Order Point Registration. Figure \T72\ presents simple point based 
matching results with first-order information. The lower points (red) are matched 
against the upper points (black) with match against expansion D^{xk) = 2Id2 and 

rotation D^^ixh) = Kot(v) = ( '^'-'^(y)'^^^(^) j f^j. y _ =F7r/2. The optimal dif- 
^ ^ ' \~sm{v),cos{v)J ^ ' ^ 

feomorphisms exhibit the expected expanding and turning effect, respectively. We 

stress that the deformations are generated using only two deformations atoms with 
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(a) Match with dilations (expansion) 



(b) Match with rotations (— 7r/2 and 7r/2) 



Fig. 7.2. Two moving points (red) are matched against two fixed points (black) with results 
(green) and with match against (a) expansion = 2Id2, i = 1,2; and (b) rotation D^ixfS) = 

Rot(ti), V = ^7r/2, i = 1,2. The kernel is Gaussian with a = 8 in grid units, and the grids are 
colored with the trace of Cauchy-Green strain tensor (log-scale). 



combined 12 parameters. Representing equivalent deformation using zcroth order 
momenta would require a significantly increased number of atoms and a correspond 
increase in the number of parameters. 

7.3. Low Dimensional Image Registration. We now exemplify how higher- 
order momenta allow low-dimensional transformations to be registered using corre- 
spondingly low-dimensional representations. We generate two test images by applying 
two linear transformations, an dilation and a rotation, to a binary image of a square, 
confer the moving images (a) and (e) in Figure [7751 By placing one deformation atom 
in the center of each fixed image and by using the similarity measure approxima- 
tion (j3.ip . we can successfully register the moving and fixed images. The result and 
difference plots are shown in Figure 17.31 The dimensionality of the linear transfor- 
mations generating the moving images is equal to the number of parameters for the 
deformation atom. A registration using zeroth order momenta would need more than 
one deformation atom which would result in a number of parameters larger than the 
dimensionality. The scale of the Gaussian kernel used for the registration is 50 pixels. 

7.4. Articulated Motion. The articulated motion of the fingeiH in Figure \7A\ 
(a) and (b) can be described by three locally linear transformations. With higher- 
order momenta, we can place deformation atoms at the center of the bones in the 
moving and fixed images, and use the point positions together with the direction of 
the bones to drive a registration. This natural and low dimensional representation 
allows a fairly good match of the images resembling the use of the Polyaffine affine 
framework for articulated registration j26j . A similar registration using zeroth order 
momenta would need two deformation atoms per bone and lacking a natural way 
to place such atoms, the positions would need to be optimized. With higher-order 
momenta, the deformation atoms can be placed in a natural and consistent way, and 



X-ray frames from |http : //www, archive . org/detalls/X-raystudlesof thejolntmovements-wellcome] 
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(a) Moving image 



(b) Fixed image (c) Registration result 



(d) Difference 






r 









(e) Moving image 



(f) Fixed image 



(g) Registration result 



(h) Difference 



Fig. 7.3. With linear transformations, the dimensionality of the higher-order representation 
matches the dimensionality of the transformation. A dilation (e) and rotation (d) is applied to the 
fixed binary images (b) and (f), respectively. The registration results (c) and (g) subtracted from 
the fixed images are shown in the difference pictures (d) and (h). The registration is performed 
with a single first-order momenta in the center of the pictures, and the number of parameters for 
the registration thus matches the dimensionality of the linear representations. The slight differences 
between results and fixed images are caused by the first- order approximation in 1 13.11 1. Increasing the 
kernel size, adding more control points, or using second order momenta would imply less difference. 




(a) Moving 



(b) Fixed 



(c) Result, first-order 



Fig. 7.4. Registering articulated movement using directional information of the bones: the 
landmarks and bone orientations (red points and arrows) in the moving image (a) are matched 
against the landmarks and bone orientations (green points and arrows) in the fixed image (b). The 
result using first-order momenta (c) can be obtained with a low number of deformation atoms that 
can be consistently placed at the center of the bones. A corresponding zeroth order representation 
would use a higher number of atoms with a corresponding increase in the number of parameters. 



the total number of free parameters is lower than a zeroth order representation using 
two atoms per bone. 

7.5. Registering Atrophy. Atrophy occurs in the human brain among patients 
suffering from Alzheimer's disease, and the progressing atrophy can be detected by 
the expansion of the ventricles |191 I13| . Since first-order momenta offer compact 
description of expansion, this makes a parametrization of the registration based on 
higher-order momenta suited for describing the expansion of the ventricles, and, in 
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addition, the deformation represented by the momenta will be easily interpretable. In 
this experiment, we therefore suggest a registration method that using few degrees of 
freedom describes the expansion of the ventricles, and does so in a way that can be 
interpreted when doing further analysis of e.g. the volume change. 

We use the publicly available Oasis dataselQ [50], and, in order to illustrate the 
use of higher-order momenta, we select a small number of patients from which two 
baseline scans are acquired at the same day together with a later follow up scan. The 
patients are in various stages of dementia, and, for each patient, wc rigidly register 
the two baseline and one follow up scan j^]. 

The expanding ventricles can be registered by placing deformation atoms in the 
center of the ventricles of the fixed image as shown in Figure [Ol For each patient, 
we manually place five deformation atoms in the ventricle area of the first baseline 3D 
volume. It is important to note that though we localize the description of the defor- 
mation to the deformation atoms, the atoms control the deformation field throughout 
the ventricle area. Based on the size of the ventricles, we use 3D Gaussian kernels 
with a scale of 15 voxels, and we let the regularization weight in (j2.1[) be A = 16. 
The effect of these choices is discussed below. Each deformation atom consists of a 
zcroth and first-order momenta. We use similarity to drive the registration [9pl 
and, for each patient, we perform two registrations: we register the two baseline scans 
acquired at the same day, and we register one baseline scan against the follow up scan. 
Thus, the baseline-baseline registration should indicate no ventricle expansion, and we 
expect the baseline- follow up registration to indicate ventricle expansion. Figure 11.11 
shows for one patient the placement of the control points in the baseline image, the 
follow up image, the log-Jacobian determinant in the ventricle area of the generated 
deformation, and the initial vector field driving the registration. 

The use of first-order momenta allows us to interpret the result of the registrations 
and to relate the results to possible expansion of the ventricles. The volume change is 
indicated by the Jacobian determinant of the generated deformation at the deforma- 
tion atoms as well as by the divergence of the first-order momenta. The latter is avail- 
able directly from the registration parameters. We plot in Figure [73] the logarithm of 
the Jacobian determinant and the divergence for both the same day baseline-baseline 
registrations and for the baseline-follow up registrations. Patient 1 — 4 are classified 
as demented, patient 5 and 6 as non-demented, and all patient have constant clinical 
dementia rating through the experiment. The time-span between baseline and follow 
up scan is 1.5-2 years with the exception of 3 years for patient four. As expected, the 
log-Jacobian is close to zero for the same day baseline-baseline scans but it increases 
with the baseline-follow up registrations of the demented patients. In addition, the 
correlation between the log-Jacobian and the divergence shows how the indicated vol- 
ume change is related directly to the registration parameters; the parameters of the 
deformation atoms can in this way be directly interpreted as encoding the amount of 
atrophy. 

We chose two important parameters above: the kernel scale and the regularization 
term. The choice of one scale for all patients works well if the ventricles to be registered 
are of approximately the same size at the baseline scans. If the ventricles vary in size, 
the scale can be chosen individually for each patient. Alternatively, a multi-scale 
approach could do this automatically which suggests combining the method with e.g. 
the kernel bundle framework j27) . Depending on the image forces, the regularization 



|http: //www, oasis-brains ■ org] 

^See also |http : //image . diku.dk/darkner/LOl] 
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(a) The average log-Jacobian of the final de- 
formation at the 5 deformation atoms for the 
baseline-baseline and baseline-follow up regis- 
trations 



1 2 3 4 5 6 

demenled non-demented 

(b) The average divergence at the deformation 
atoms for the baseline-baseline and baseline- 
follow up registrations 



Fig. 7.5. Indicated volume change: (a) The average log-Jacobian determinant of the generated 
deformation at the 5 deformation atoms for six patients (1-4 demented, 5-6 non- demented); (b) 
divergence of the 5 higher-order momenta representing the deformation. The divergence can be 
extracted directly from the parameters of the first-order momenta, and the correlation between the 
log-Jacobian and the divergence as seen by the similarity between (a) and (b) therefore shows the 
interpretability of the deformation atoms. The time-span between baseline and follow up scans are 
1.5-2 years with the exception of 3 years for patient four (arrows). 



term in (j2.ip will affect the amount of expansion captured in the registration. Because 
of the low number of control points, we can in practice set the contribution of the 
regularization term to zero without experiencing non-diffeomorphic results. It will be 
interesting in the future to estimate the actual volume expansion directly using the 
parameters of the deformation atoms with this less biased model. 

8. Conclusion and Outlook. We have introduced higher-order momenta in the 
LDDMM registration framework. The momenta allow compact representation of lo- 
cally afhne transformations by increasing the capacity of the deformation description. 
Coupled with similarity measures incorporating first-order information, the higher- 
order momenta improve the range of deformations reached by sparsely discretized 
LDDMM methods, and they allow direct capture of first-order information such as 
expansion and contraction. In addition, the constitute deformation atoms for which 
the generated deformation is directly interpretable. 

We have shown how the partial derivative reproducing property implies singular 
momentum for the higher-order momenta, and we used this to derive the EPDiff 
evolution equations. By computing the forward and backward variational equations, 
we are able to transport gradient information and derive a matching algorithm. We 
provide examples showing typical deformation coded by first-order momenta and how 
images can be registered using a very few parameters, and we have applied the method 
to register human brains with progressing atrophy. 

The experiments included here show only a first step in the application of higher- 
order momenta: the representation may be applied to register entire images; merging 
the method with multi-scale approaches will increase the description capacity and may 
lead to further reduction in the dimensionality of the representation. Combined with 
efficient implementations, higher-order momenta promise to provide a step forward in 
compact deformation description for image registration. 
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